pacman::p_load(sf, sfdep, tmap, tidyverse, knitr)Take-home Exercise 1: Geospatial Analytics for Public Good
1. Overview
The digitization of urban infrastructures, such as buses and public utilities, generates datasets that, when analyzed through technologies like GPS, reveal valuable patterns in human movement. This information enhances urban management and aids both private and public transport providers in making informed decisions for a competitive edge in the evolving urban landscape. Hence there is value in performing analysis on such information, which is further showcased in this exercise.
1.1 Objectives
In this exercise, the focus will be to identify, analyse and compare the spatial patterns, through using of Geovisualisation and Analysis, and Local Indicators of Spatial Association (LISA) to uncover the spatial and spatio-temporal mobility patterns of public bus passengers in Singapore.
The specific tasks of this exercise are as follows:
Objective 1: Geovisualisation and Analysis
With reference to the time intervals provided in the table below, compute the passenger trips generated by origin at the hexagon level,
Peak hour period Bus tap on time Weekday morning peak 6am to 9am Weekday afternoon peak 5pm to 8pm Weekend/holiday morning peak 11am to 2pm Weekend/holiday evening peak 4pm to 7pm Display the geographical distribution of the passenger trips by using appropriate geovisualisation methods,
Describe the spatial patterns revealed by the geovisualisation (not more than 200 words per visual).
Objective 2: Local Indicators of Spatial Association (LISA) Analysis
Compute LISA of the passengers trips generate by origin at hexagon level.
Display the LISA maps of the passengers trips generate by origin at hexagon level. The maps should only display the significant (i.e. p-value < 0.05)
With reference to the analysis results, draw statistical conclusions (not more than 200 words per visual).
Objective 2: Emerging Hot Spot Analysis(EHSA)
With reference to the passenger trips by origin at the hexagon level for the four time intervals given above:
Perform Mann-Kendall Test by using the spatio-temporal local Gi* values,
Prepared EHSA maps of the Gi* values of the passenger trips by origin at the hexagon level. The maps should only display the significant (i.e. p-value < 0.05).
With reference to the EHSA maps and data visualisation prepared, describe the spatial patterns reveled. (not more than 250 words per cluster).Emerging Hot Spot Analysis(EHSA)
With reference to the passenger trips by origin at the hexagon level for the four time intervals given above:
Perform Mann-Kendall Test by using the spatio-temporal local Gi* values,
Prepared EHSA maps of the Gi* values of the passenger trips by origin at the hexagon level. The maps should only display the significant (i.e. p-value < 0.05).
With reference to the EHSA maps and data visualisation prepared, describe the spatial patterns reveled. (not more than 250 words per cluster).
2. Installing and Loading the R Packages
For this exercise, in the following code chunk, p_load() from pacman package is used to install and load the following R packages into the R environment:
sf for importing, managing, and processing geospatial data,
sfdep to compute spatial weights using appropriate functions
tmap for creating thematic maps,
tidyverse for performing data science tasks such as importing, wrangling and visualising data,
knitr for creating html table.
3. Data Preparation
3.1 Data
There are 2 datasets used, as outlined in sections 3.1 and 3.2
3.1.1 Apastial data
For the purpose of this take-home exercise, Passenger Volume by Origin Destination Bus Stops downloaded from LTA DataMall will be used. Relevant data for the month of August, September and Oct 2023 had been downloaded. For this exercise, we will observe this dataset for the time period of Oct 2023. This can reproduced later on by point to the csv file for the required month.
3.1.2 Geospatial data
Two geospatial data will be used in this study, they are:
Bus Stop Location from LTA DataMall. It provides information about all the bus stops currently being serviced by buses, including the bus stop code (identifier) and location coordinates.
hexagon, a hexagon layer of 250m (this distance is the perpendicular distance between the centre of the hexagon and its edges or in short “Apothem distance” ) should be created using Bus Stop Location and Passenger Volume by Origin Destination Bus Stops data.
3.2 Getting the Data Into R Environment
3.2.1 Importing the aspatial data
We will import the Passenger Volume By Origin Destination Bus Stops data set for Oct 2023 downloaded from LTA Datamall by using read_csv().
odbus <- read_csv("data/aspatial/origin_destination_bus_202310.csv")Using glimpse(odbus), we can see that there are 5,694,297 rows of datapoints in this dataset.
A quick check of odbus tibble data frame also shows that the values in ORIGIN_PT_CODE and DESTINATON_PT_CODE are already in <char> type. Hence, no adjustments needed to the data type.
glimpse(odbus)Rows: 5,694,297
Columns: 7
$ YEAR_MONTH <chr> "2023-10", "2023-10", "2023-10", "2023-10", "2023-…
$ DAY_TYPE <chr> "WEEKENDS/HOLIDAY", "WEEKDAY", "WEEKENDS/HOLIDAY",…
$ TIME_PER_HOUR <dbl> 16, 16, 14, 14, 17, 17, 17, 7, 14, 14, 10, 20, 20,…
$ PT_TYPE <chr> "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "BUS", "…
$ ORIGIN_PT_CODE <chr> "04168", "04168", "80119", "80119", "44069", "2028…
$ DESTINATION_PT_CODE <chr> "10051", "10051", "90079", "90079", "17229", "2014…
$ TOTAL_TRIPS <dbl> 3, 5, 3, 5, 4, 1, 24, 2, 1, 7, 3, 2, 5, 1, 1, 1, 1…
3.2.2 Extracting the relevant study data from the aspatial data
For the purpose of this exercise, we will extract commuting flows based on the intervals denoted in section 1.1
However, I realized that the weekday afternoon peak stats later than weekend/holiday evening peak.
For purposes of consistency and comparison, I will rename peak hour period “Weekday afternoon peak” to “Weekday evening peak”
| Peak hour period | Bus tap on time | Corresponding Table Name |
|---|---|---|
| Weekday morning peak | 6am to 9am | origin6_9 |
| 5pm to 8pm | origin17_20 | |
| Weekend/holiday morning peak | 11am to 2pm | origin11_14 |
| Weekend/holiday evening peak | 4pm to 7pm | origin16_19 |
The following data wrangling and transformation functions will be used:
origin6_9 <- odbus %>%
filter (DAY_TYPE == "WEEKDAY") %>%
filter (TIME_PER_HOUR >= 6 & TIME_PER_HOUR <=9) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
origin17_20 <- odbus %>%
filter (DAY_TYPE == "WEEKDAY") %>%
filter (TIME_PER_HOUR >= 17 & TIME_PER_HOUR <=20) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
origin11_14 <- odbus %>%
filter (DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter (TIME_PER_HOUR >= 11 & TIME_PER_HOUR <=14) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))
origin16_19 <- odbus %>%
filter (DAY_TYPE == "WEEKENDS/HOLIDAY") %>%
filter (TIME_PER_HOUR >= 16 & TIME_PER_HOUR <=19) %>%
group_by(ORIGIN_PT_CODE) %>%
summarise(TRIPS = sum(TOTAL_TRIPS))We will save the output in rds format for future use.
write_rds(origin6_9, "data/rds/origin6_9.rds")
write_rds(origin17_20, "data/rds/origin17_20.rds")
write_rds(origin11_14, "data/rds/origin11_14.rds")
write_rds(origin16_19, "data/rds/origin16_19.rds")A check on origin6_9 data table indicates the it is correctly mapped.
origin6_9 <- read_rds("data/rds/origin6_9.rds")
origin17_20 <- read_rds("data/rds/origin17_20.rds")
origin11_14 <- read_rds("data/rds/origin11_14.rds")
origin16_19 <- read_rds("data/rds/origin16_19.rds")kable(head(origin6_9))| ORIGIN_PT_CODE | TRIPS |
|---|---|
| 01012 | 1770 |
| 01013 | 841 |
| 01019 | 1530 |
| 01029 | 2483 |
| 01039 | 2919 |
| 01059 | 1734 |
kable(head(origin17_20))| ORIGIN_PT_CODE | TRIPS |
|---|---|
| 01012 | 8000 |
| 01013 | 7038 |
| 01019 | 3398 |
| 01029 | 9089 |
| 01039 | 12095 |
| 01059 | 2212 |
kable(head(origin11_14))| ORIGIN_PT_CODE | TRIPS |
|---|---|
| 01012 | 2177 |
| 01013 | 1818 |
| 01019 | 1536 |
| 01029 | 3217 |
| 01039 | 5408 |
| 01059 | 1159 |
kable(head(origin16_19))| ORIGIN_PT_CODE | TRIPS |
|---|---|
| 01012 | 3061 |
| 01013 | 2770 |
| 01019 | 1685 |
| 01029 | 4063 |
| 01039 | 7263 |
| 01059 | 1118 |
3.2.3 Exploratory data analysis on data
3.2.3.1 EDA on weekday morning peak data
Exploratory Data Analysis indicates following observations:
#Total trips during weekday morning peak for the month
sum(origin6_9$TRIPS)[1] 25976410
#Maximum trips during weekday morning peak for the month from 1 origin
max(origin6_9$TRIPS)[1] 357043
Checks in data notes that the highest no. of bus trip originates from Boon Lay Interchange (Loc_DESC)
#Minimum trips during weekday morning peak for the month from 1 origin
min (origin6_9$TRIPS)[1] 1
#Median no. of trips during weekday morning peak from 1 origin
median(origin6_9$TRIPS)[1] 2198
#Distribution check
hist(origin6_9$`TRIPS`)
Distribution is skewed heavily to the left.
3.2.3.2 EDA on weekday evening peak data
Exploratory Data Analysis indicates following observations:
#Total trips during weekday evening peak for the month
sum(origin17_20$TRIPS)[1] 25075097
#Maximum trips during weekday evening peak for the month from 1 origin
max(origin17_20$TRIPS)[1] 529081
Checks in data notes that the highest no. of bus trip originates from Boon Lay Interchange (Loc_DESC)
#Minimum trips during weekday evening peak for the month from 1 origin
min (origin17_20$TRIPS)[1] 1
#Median no. of trips during weekday evening peak from 1 origin
median(origin17_20$TRIPS)[1] 1972
hist(origin17_20$`TRIPS`)
Distribution is skewed heavily to the left.
3.2.3.3 EDA on Weekend/holiday morning peak data
Exploratory Data Analysis indicates following observations:
#Total trips during Weekend/holiday morning peak for the month
sum(origin11_14$TRIPS)[1] 7737391
#Maximum trips during Weekend/holiday morning peak for the month from 1 origin
max(origin11_14$TRIPS)[1] 103358
Checks in data notes that the highest no. of bus trip originates from Boon Lay Interchange (Loc_DESC)
#Minimum trips during Weekend/holiday morning peak for the month from 1 origin
min (origin11_14$TRIPS)[1] 1
#Median no. of trips during Weekend/holiday morning peak from 1 origin
median(origin11_14$TRIPS)[1] 666
hist(origin11_14$`TRIPS`)
Distribution is skewed heavily to the left.
3.2.3.4 EDA on Weekend/holiday evening peak data
Exploratory Data Analysis indicates following observations:
#Total trips during Weekend/holiday evening peak for the month
sum(origin16_19$TRIPS)[1] 7769394
#Maximum trips during Weekend/holiday evening peak for the month from 1 origin
max(origin16_19$TRIPS)[1] 144903
Checks in data notes that the highest no. of bus trip originates from Boon Lay Interchange (Loc_DESC)
#Minimum trips during Weekend/holiday evening peak for the month from 1 origin
min (origin16_19$TRIPS)[1] 1
#Median no. of trips during Weekend/holiday evening peak from 1 origin
median(origin16_19$TRIPS)[1] 629
hist(origin16_19$`TRIPS`)
Distribution is skewed heavily to the left.
3.2.3.5 EDA comparison across the different peak periods
Some observations noted for bus trips in Oct 2023:
All distributions are heavily skewed to the left.
Total bus trips for the month for both weekday peaks for both morning and afternoon are quite similar, 26,088,201 (morning) vs 25,151,947 (evening)
Total bus trips for the month for both weekend/holiday peaks for both morning and evening are quite similar, 7,766,064 (morning) vs 7,797,319 (evening)
Total bus trips during weekday peaks are circa 3 times more than bus trips during weekend/holiday peaks.
From the available data, the highest number of trips (529,081) for the month from across all peak periods occurred during weekday evening peak, and it originated from Boon Lay Interchange.
During all peak periods, the highest number of bus trips originates from Boon Lay Interchange. Based on public research, it is situated within Jurong Point Shopping Mall and integrated with the nearby Boon Lay MRT Station. This interchange also serves a variety of passengers, including those from Nanyang Technological University, Jurong Industrial Estate and Tuas Industrial Estate. It also has 27 bus services and it is among the largest and busiest bus interchange in Singapore with an affluence of more than 85,000 commuters every day. [1] This supports the observation that a high number of bus trips originates from this location.
3.2.3 Importing the geospatial data
The code chunk below
uses the st_read() function of sf package to import
Busstopshapefile into R as a simple feature data frame calledbusstopAssigning EPSG code 3414 to busstop data frame.
busstop <- st_read(dsn = "data/geospatial", layer = "BusStop") %>%
st_transform(crs = 3414)Reading layer `BusStop' from data source
`D:\starweb84\ISSS624\Take-home_Ex\Take-home_Ex1\data\geospatial'
using driver `ESRI Shapefile'
Simple feature collection with 5161 features and 3 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 3970.122 ymin: 26482.1 xmax: 48284.56 ymax: 52983.82
Projected CRS: SVY21
st_crs(busstop)Coordinate Reference System:
User input: EPSG:3414
wkt:
PROJCRS["SVY21 / Singapore TM",
BASEGEOGCRS["SVY21",
DATUM["SVY21",
ELLIPSOID["WGS 84",6378137,298.257223563,
LENGTHUNIT["metre",1]]],
PRIMEM["Greenwich",0,
ANGLEUNIT["degree",0.0174532925199433]],
ID["EPSG",4757]],
CONVERSION["Singapore Transverse Mercator",
METHOD["Transverse Mercator",
ID["EPSG",9807]],
PARAMETER["Latitude of natural origin",1.36666666666667,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8801]],
PARAMETER["Longitude of natural origin",103.833333333333,
ANGLEUNIT["degree",0.0174532925199433],
ID["EPSG",8802]],
PARAMETER["Scale factor at natural origin",1,
SCALEUNIT["unity",1],
ID["EPSG",8805]],
PARAMETER["False easting",28001.642,
LENGTHUNIT["metre",1],
ID["EPSG",8806]],
PARAMETER["False northing",38744.572,
LENGTHUNIT["metre",1],
ID["EPSG",8807]]],
CS[Cartesian,2],
AXIS["northing (N)",north,
ORDER[1],
LENGTHUNIT["metre",1]],
AXIS["easting (E)",east,
ORDER[2],
LENGTHUNIT["metre",1]],
USAGE[
SCOPE["Cadastre, engineering survey, topographic mapping."],
AREA["Singapore - onshore and offshore."],
BBOX[1.13,103.59,1.47,104.07]],
ID["EPSG",3414]]
Before moving to the next step, we save the output into rds format as a good practice.
write_rds(busstop, "data/rds/busstop.csv") 3.3 Geospatial data wrangling
3.3.1 Data wrangling with first set of time interval data - Weekday morning peak
In this section, we are going to append the busstop data frame onto origin6_9 data frame first.
origin_data6_9 <- left_join(origin6_9 , busstop,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE)We check for duplicating records
duplicate6_9 <- origin_data6_9 %>%
group_by_all() %>%
filter(n()>1) %>%
ungroup()We use the code chunk below to retain the unique records.
origin_data6_9 <- unique(origin_data6_9)Using head(), we noticed that the origin_data6_9 does not appear to be in simple feature form.
head(origin_data6_9, n=5)# A tibble: 5 × 5
ORIGIN_BS TRIPS BUS_ROOF_N LOC_DESC geometry
<chr> <dbl> <chr> <chr> <POINT [m]>
1 01012 1770 B03 HOTEL GRAND PACIFIC (30140.8 31031.95)
2 01013 841 B05 ST JOSEPH'S CH (30218.75 31126.49)
3 01019 1530 B04 BRAS BASAH CPLX (30187.77 31034.55)
4 01029 2483 B07 OPP NATL LIB (30345.83 31007.64)
5 01039 2919 B09 BUGIS CUBE (30471.08 31175.63)
In the following section, we will create a Choropleth map with a hexagon layer. The code utilises a function called st_make_grid, whcih needs to recognize origin_data as a simple feature form with geospatial information in order to process it. Hence, in anticipation of that, we will convert the data into simple feature form using the below code chunk.
origin_data6_9<- st_as_sf (origin_data6_9)A check indicates it now to be in simple feature form
head(origin_data6_9, n=5)Simple feature collection with 5 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 30140.8 ymin: 31007.64 xmax: 30471.08 ymax: 31175.63
Projected CRS: SVY21 / Singapore TM
# A tibble: 5 × 5
ORIGIN_BS TRIPS BUS_ROOF_N LOC_DESC geometry
<chr> <dbl> <chr> <chr> <POINT [m]>
1 01012 1770 B03 HOTEL GRAND PACIFIC (30140.8 31031.95)
2 01013 841 B05 ST JOSEPH'S CH (30218.75 31126.49)
3 01019 1530 B04 BRAS BASAH CPLX (30187.77 31034.55)
4 01029 2483 B07 OPP NATL LIB (30345.83 31007.64)
5 01039 2919 B09 BUGIS CUBE (30471.08 31175.63)
We now replicate the same data wrangling for the other 3 datatables: origin17_20, origin11_14 and origin16_19.
3.3.2 Data wrangling with second set of time interval data - weekday evening peak
3.3.1 For origin17_20:
Append the busstop data frame onto origin17_20 data frame
origin_data17_20 <- left_join(origin17_20 , busstop,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE)We check for duplicating records
duplicate17_20 <- origin_data17_20 %>%
group_by_all() %>%
filter(n()>1) %>%
ungroup()We use the code chunk below to retain the unique records.
origin_data17_20 <- unique(origin_data17_20)Using head(), we noticed that the origin_data17_20 does not appear to be in simple feature form.
head(origin_data17_20, n=5)# A tibble: 5 × 5
ORIGIN_BS TRIPS BUS_ROOF_N LOC_DESC geometry
<chr> <dbl> <chr> <chr> <POINT [m]>
1 01012 8000 B03 HOTEL GRAND PACIFIC (30140.8 31031.95)
2 01013 7038 B05 ST JOSEPH'S CH (30218.75 31126.49)
3 01019 3398 B04 BRAS BASAH CPLX (30187.77 31034.55)
4 01029 9089 B07 OPP NATL LIB (30345.83 31007.64)
5 01039 12095 B09 BUGIS CUBE (30471.08 31175.63)
In the following section, we will create a Choropleth map with a hexagon layer. The code utilises a function called st_make_grid, whcih needs to recognize origin_data as a simple feature form with geospatial information in order to process it. Hence, in anticipation of that, we will convert the data into simple feature form using the below code chunk.
origin_data17_20<- st_as_sf (origin_data17_20)A check indicates it now to be in simple feature form
head(origin_data17_20, n=5)Simple feature collection with 5 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 30140.8 ymin: 31007.64 xmax: 30471.08 ymax: 31175.63
Projected CRS: SVY21 / Singapore TM
# A tibble: 5 × 5
ORIGIN_BS TRIPS BUS_ROOF_N LOC_DESC geometry
<chr> <dbl> <chr> <chr> <POINT [m]>
1 01012 8000 B03 HOTEL GRAND PACIFIC (30140.8 31031.95)
2 01013 7038 B05 ST JOSEPH'S CH (30218.75 31126.49)
3 01019 3398 B04 BRAS BASAH CPLX (30187.77 31034.55)
4 01029 9089 B07 OPP NATL LIB (30345.83 31007.64)
5 01039 12095 B09 BUGIS CUBE (30471.08 31175.63)
3.3.3 Data wrangling with third set of time interval data - Weekend/holiday morning peak
3.3.1 For origin11_14:
Append the busstop data frame onto origin11_14 data frame
origin_data11_14 <- left_join(origin11_14 , busstop,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE)We check for duplicating records
duplicate11_14 <- origin_data11_14 %>%
group_by_all() %>%
filter(n()>1) %>%
ungroup()We use the code chunk below to retain the unique records.
origin_data11_14 <- unique(origin_data11_14)Using head(), we noticed that the origin_data11_14 does not appear to be in simple feature form.
head(origin_data11_14, n=5)# A tibble: 5 × 5
ORIGIN_BS TRIPS BUS_ROOF_N LOC_DESC geometry
<chr> <dbl> <chr> <chr> <POINT [m]>
1 01012 2177 B03 HOTEL GRAND PACIFIC (30140.8 31031.95)
2 01013 1818 B05 ST JOSEPH'S CH (30218.75 31126.49)
3 01019 1536 B04 BRAS BASAH CPLX (30187.77 31034.55)
4 01029 3217 B07 OPP NATL LIB (30345.83 31007.64)
5 01039 5408 B09 BUGIS CUBE (30471.08 31175.63)
In the following section, we will create a Choropleth map with a hexagon layer. The code utilises a function called st_make_grid, whcih needs to recognize origin_data as a simple feature form with geospatial information in order to process it. Hence, in anticipation of that, we will convert the data into simple feature form using the below code chunk.
origin_data11_14<- st_as_sf (origin_data11_14)A check indicates it now to be in simple feature form
head(origin_data11_14, n=5)Simple feature collection with 5 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 30140.8 ymin: 31007.64 xmax: 30471.08 ymax: 31175.63
Projected CRS: SVY21 / Singapore TM
# A tibble: 5 × 5
ORIGIN_BS TRIPS BUS_ROOF_N LOC_DESC geometry
<chr> <dbl> <chr> <chr> <POINT [m]>
1 01012 2177 B03 HOTEL GRAND PACIFIC (30140.8 31031.95)
2 01013 1818 B05 ST JOSEPH'S CH (30218.75 31126.49)
3 01019 1536 B04 BRAS BASAH CPLX (30187.77 31034.55)
4 01029 3217 B07 OPP NATL LIB (30345.83 31007.64)
5 01039 5408 B09 BUGIS CUBE (30471.08 31175.63)
3.3.4 Data wrangling with fourth set of time interval data - Weekend/holiday evening peak
3.3.1 For origin16_19:
Append the busstop data frame onto origin16_19 data frame
origin_data16_19 <- left_join(origin16_19 , busstop,
by = c("ORIGIN_PT_CODE" = "BUS_STOP_N")) %>%
rename(ORIGIN_BS = ORIGIN_PT_CODE)We check for duplicating records
duplicate16_19 <- origin_data16_19 %>%
group_by_all() %>%
filter(n()>1) %>%
ungroup()We use the code chunk below to retain the unique records.
origin_data16_19 <- unique(origin_data16_19)Using head(), we noticed that the origin_data16_19 does not appear to be in simple feature form.
head(origin_data16_19, n=5)# A tibble: 5 × 5
ORIGIN_BS TRIPS BUS_ROOF_N LOC_DESC geometry
<chr> <dbl> <chr> <chr> <POINT [m]>
1 01012 3061 B03 HOTEL GRAND PACIFIC (30140.8 31031.95)
2 01013 2770 B05 ST JOSEPH'S CH (30218.75 31126.49)
3 01019 1685 B04 BRAS BASAH CPLX (30187.77 31034.55)
4 01029 4063 B07 OPP NATL LIB (30345.83 31007.64)
5 01039 7263 B09 BUGIS CUBE (30471.08 31175.63)
In the following section, we will create a Choropleth map with a hexagon layer. The code utilises a function called st_make_grid, whcih needs to recognize origin_data as a simple feature form with geospatial information in order to process it. Hence, in anticipation of that, we will convert the data into simple feature form using the below code chunk.
origin_data16_19<- st_as_sf (origin_data16_19)A check indicates it now to be in simple feature form
head(origin_data16_19, n=5)Simple feature collection with 5 features and 4 fields
Geometry type: POINT
Dimension: XY
Bounding box: xmin: 30140.8 ymin: 31007.64 xmax: 30471.08 ymax: 31175.63
Projected CRS: SVY21 / Singapore TM
# A tibble: 5 × 5
ORIGIN_BS TRIPS BUS_ROOF_N LOC_DESC geometry
<chr> <dbl> <chr> <chr> <POINT [m]>
1 01012 3061 B03 HOTEL GRAND PACIFIC (30140.8 31031.95)
2 01013 2770 B05 ST JOSEPH'S CH (30218.75 31126.49)
3 01019 1685 B04 BRAS BASAH CPLX (30187.77 31034.55)
4 01029 4063 B07 OPP NATL LIB (30345.83 31007.64)
5 01039 7263 B09 BUGIS CUBE (30471.08 31175.63)
4. Choropleth Visualisation
4.1 Making the Grid
In this section, we will be creating hexagon grids to help in subsequent analysis.
The below code chunk shows the hexagon grid making for the first time interval using the data table origin_data6_9.
4.1.1 Making the Grid for Weekday morning peak data
# Apothem distance is distance from the centre of a regular polygon at right angles to any of its sides. As stated by the data requirement, we set the desired apothem distance at 250
apothem_distance <- 250
# Calculate the side length
side_length <- (2 * apothem_distance) / sqrt(3)
# Use the calculated side length in st_make_grid
area_honeycomb_grid6_9 <- st_make_grid(origin_data6_9, cellsize = c(side_length, side_length), what = "polygons", square = FALSE)
# To sf and add grid ID
honeycomb_grid_sf6_9 = st_sf(area_honeycomb_grid6_9) %>%
# add grid ID
mutate(grid_id = 1:length(lengths(area_honeycomb_grid6_9)))
# Use st_intersection to get a new dataset with intersecting points
intersections_data6_9 <- st_intersection(honeycomb_grid_sf6_9, origin_data6_9)
# Aggregate to sum the "TRIPS" values for each grid
sum_trips6_9 <- aggregate(intersections_data6_9$TRIPS, by = list(intersections_data6_9$grid_id), sum, na.rm = TRUE)
# Merge the result back to the honeycomb grid
colnames(sum_trips6_9) <- c("grid_id", "Sum_TRIPS")
honeycomb_grid_sf6_9 <- merge(honeycomb_grid_sf6_9, sum_trips6_9, by.x = "grid_id", by.y = "grid_id", all.x = TRUE)
# remove grid with value of 0
honeycomb_count6_9 = filter(honeycomb_grid_sf6_9, Sum_TRIPS > 0)This is replicated for the remaining 3 time intervals.
4.1.2 Making the Grid for weekday evening peak data
Show the code
# Apothem distance is distance from the centre of a regular polygon at right angles to any of its sides. As stated by the data requirement, we set the desired apothem distance at 250
apothem_distance <- 250
# Calculate the side length
side_length <- (2 * apothem_distance) / sqrt(3)
# Use the calculated side length in st_make_grid
area_honeycomb_grid17_20 <- st_make_grid(origin_data17_20, cellsize = c(side_length, side_length), what = "polygons", square = FALSE)
# To sf and add grid ID
honeycomb_grid_sf17_20 = st_sf(area_honeycomb_grid17_20) %>%
# add grid ID
mutate(grid_id = 1:length(lengths(area_honeycomb_grid17_20)))
# Use st_intersection to get a new dataset with intersecting points
intersections_data17_20 <- st_intersection(honeycomb_grid_sf17_20, origin_data17_20)
# Aggregate to sum the "TRIPS" values for each grid
sum_trips17_20 <- aggregate(intersections_data17_20$TRIPS, by = list(intersections_data17_20$grid_id), sum, na.rm = TRUE)
# Merge the result back to the honeycomb grid
colnames(sum_trips17_20) <- c("grid_id", "Sum_TRIPS")
honeycomb_grid_sf17_20 <- merge(honeycomb_grid_sf17_20, sum_trips17_20, by.x = "grid_id", by.y = "grid_id", all.x = TRUE)
# remove grid with value of 0
honeycomb_count17_20 = filter(honeycomb_grid_sf17_20, Sum_TRIPS > 0)4.1.3 Making the Grid for Weekend/holiday morning peak data
Show the code
# Apothem distance is distance from the centre of a regular polygon at right angles to any of its sides. As stated by the data requirement, we set the desired apothem distance at 250
apothem_distance <- 250
# Calculate the side length
side_length <- (2 * apothem_distance) / sqrt(3)
# Use the calculated side length in st_make_grid
area_honeycomb_grid11_14 <- st_make_grid(origin_data11_14, cellsize = c(side_length, side_length), what = "polygons", square = FALSE)
# To sf and add grid ID
honeycomb_grid_sf11_14 = st_sf(area_honeycomb_grid11_14) %>%
# add grid ID
mutate(grid_id = 1:length(lengths(area_honeycomb_grid11_14)))
# Use st_intersection to get a new dataset with intersecting points
intersections_data11_14 <- st_intersection(honeycomb_grid_sf11_14, origin_data11_14)
# Aggregate to sum the "TRIPS" values for each grid
sum_trips11_14 <- aggregate(intersections_data11_14$TRIPS, by = list(intersections_data11_14$grid_id), sum, na.rm = TRUE)
# Merge the result back to the honeycomb grid
colnames(sum_trips11_14) <- c("grid_id", "Sum_TRIPS")
honeycomb_grid_sf11_14 <- merge(honeycomb_grid_sf11_14, sum_trips11_14, by.x = "grid_id", by.y = "grid_id", all.x = TRUE)
# remove grid with value of 0
honeycomb_count11_14 = filter(honeycomb_grid_sf11_14, Sum_TRIPS > 0)4.1.4 Making the Grid for Weekend/holiday evening peak data
Show the code
# Apothem distance is distance from the centre of a regular polygon at right angles to any of its sides. As stated by the data requirement, we set the desired apothem distance at 250
apothem_distance <- 250
# Calculate the side length
side_length <- (2 * apothem_distance) / sqrt(3)
# Use the calculated side length in st_make_grid
area_honeycomb_grid16_19 <- st_make_grid(origin_data16_19, cellsize = c(side_length, side_length), what = "polygons", square = FALSE)
# To sf and add grid ID
honeycomb_grid_sf16_19 = st_sf(area_honeycomb_grid16_19) %>%
# add grid ID
mutate(grid_id = 1:length(lengths(area_honeycomb_grid16_19)))
# Use st_intersection to get a new dataset with intersecting points
intersections_data16_19 <- st_intersection(honeycomb_grid_sf16_19, origin_data16_19)
# Aggregate to sum the "TRIPS" values for each grid
sum_trips16_19 <- aggregate(intersections_data16_19$TRIPS, by = list(intersections_data16_19$grid_id), sum, na.rm = TRUE)
# Merge the result back to the honeycomb grid
colnames(sum_trips16_19) <- c("grid_id", "Sum_TRIPS")
honeycomb_grid_sf16_19 <- merge(honeycomb_grid_sf16_19, sum_trips16_19, by.x = "grid_id", by.y = "grid_id", all.x = TRUE)
# remove grid with value of 0
honeycomb_count16_19 = filter(honeycomb_grid_sf16_19, Sum_TRIPS > 0)4.2 Objective 1: Geovisualisation and analysis of passenger trips
4.2.1 Geovisualisation and analysis of passenger trips during weekday morning peak
The below code chunk creates the Choropleth map using tmap and the hexagon overlay.
tmap_mode("view")
map_honeycomb6_9 = tm_shape(honeycomb_count6_9) +
tm_fill(
col = "Sum_TRIPS",
palette = "Reds",
style = "jenks",
title = "Bus trips during weekday morning peak",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.vars = c(
"No. of bus trips: " = "Sum_TRIPS"
),
popup.format = list(
Sum_TRIPS = list(format = "f", digits = 0)
)
) +
tm_borders(col = "grey40", lwd = 0.7)
map_honeycomb6_9Geospatial visualization reveals distinctive patterns during weekday morning peaks (6am to 9pm), with a notable surge in the origination of bus trips emanating from residential heartlands like Tampines, Boon Lay, and SengKang. In contrast, fewer bus journeys commence from the bustling city center and industrial zones, where commercial and industrial spaces predominate over residential areas. This phenomenon aligns with the typical morning routine, wherein commuters embark on buses from their homes to workplaces during weekday peak hours.
Upon closer scrutiny, specific zones exhibit exceptionally high trip counts, particularly in proximity to key transit hubs such as Boon Lay Interchange, Woodlands MRT station, the Woodlands checkpoint, and major MRT-bus interchanges. These locations witness a heightened concentration of passengers, suggesting a correlation with areas where individuals assemble to transfer buses for diverse destinations.
While preliminary observations hint at a potential correlation between higher trip numbers and residential heartland areas characterized by numerous HDBs and condominiums, definitive conclusions await the overlay of demographic data. Further analysis incorporating demographic information is essential to validate whether these patterns correlate with income levels or specific residential types, providing a more comprehensive understanding of the dynamics influencing public transit usage in these regions.
4.2.2 Geovisualisation and analysis of passenger trips during weekday evening peak
The below code chunk creates the Choropleth map using tmap and the hexagon overlay.
tmap_mode("view")
map_honeycomb17_20 = tm_shape(honeycomb_count17_20) +
tm_fill(
col = "Sum_TRIPS",
palette = "Reds",
style = "jenks",
title = "Bus trips during weekday evening peak",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.vars = c(
"No. of bus trips: " = "Sum_TRIPS"
),
popup.format = list(
Sum_TRIPS = list(format = "f", digits = 0)
)
) +
tm_borders(col = "grey40", lwd = 0.7)
map_honeycomb17_20Geospatial visualization exposes a distinctive transit pattern during the weekday evening peak (5 pm to 8 pm), emphasizing the sustained importance of major transit hubs like Boon Lay Interchange and Woodlands MRT station. In the city center, a noticeable increase in bus trips is observed compared to the morning peak, with heightened colors indicating greater activity.
Conversely, residential heartlands experience a decline in originating bus trips during the evening peak, aligning with the post-office hours when commuters return home. Noteworthy exceptions include certain transit hubs, such as Boon Lay Interchange, witnessing a substantial 50% surge compared to the morning. This prompts a vital consideration: assessing whether an increased bus departure frequency is necessary from these hubs to meet the escalated demand during the weekday evening peak. Such analysis is crucial for optimizing transportation services and addressing heightened demand during specific time periods.
4.2.3 Geovisualisation and analysis of passenger trips during Weekend/holiday morning peak
The below code chunk creates the Choropleth map using tmap and the hexagon overlay.
tmap_mode("view")
map_honeycomb11_14 = tm_shape(honeycomb_count11_14) +
tm_fill(
col = "Sum_TRIPS",
palette = "Blues",
style = "jenks",
title = "Bus trips during Weekend/holiday morning peak",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.vars = c(
"No. of bus trips: " = "Sum_TRIPS"
),
popup.format = list(
Sum_TRIPS = list(format = "f", digits = 0)
)
) +
tm_borders(col = "grey40", lwd = 0.7)
map_honeycomb11_14Geospatial visualization reveals a consistent trend during weekend and holiday morning peaks (11 am to 2 pm), with a notable increase in bus trips originating from transit hubs. However, a distinctive difference is observed in the shades of blue at residential hubs, appearing visibly lighter compared to weekday morning peaks. The overall count of bus trips per zone has significantly decreased in comparison to weekday mornings. The cause of this decline remains uncertain—it could be attributed to a general reduction in weekend travel or a more evenly distributed ridership throughout the day, given the absence of the typical workday morning rush.
The ambiguity arises as weekends often entail fewer people commuting during the morning, possibly due to non-working or half-working days. A comprehensive investigation beyond the peak period is warranted to ascertain whether this decline represents an overall reduction in bus rides across various zones. Delving further into weekend time periods would provide valuable insights into travel patterns and help determine the factors contributing to the observed changes in bus trip frequencies
4.2.4 Geovisualisation and analysis of passenger trips during Weekend/holiday evening peak
The below code chunk creates the Choropleth map using tmap and the hexagon overlay.
tmap_mode("view")
map_honeycomb16_19 = tm_shape(honeycomb_count16_19) +
tm_fill(
col = "Sum_TRIPS",
palette = "Blues",
style = "jenks",
title = "Bus trips during Weekend/holiday evening peak",
id = "grid_id",
showNA = FALSE,
alpha = 0.6,
popup.vars = c(
"No. of bus trips: " = "Sum_TRIPS"
),
popup.format = list(
Sum_TRIPS = list(format = "f", digits = 0)
)
) +
tm_borders(col = "grey40", lwd = 0.7)
map_honeycomb16_19Geospatial visualization illustrates a consistent trend during weekend and holiday evening peaks (4 pm to 7 pm), with an increased number of trips originating from transit hubs. However, a discernible decline in overall bus trips per zone, compared to weekday evenings, raises questions about the reasons behind this reduction. It remains unclear whether this decline is due to fewer people traveling on weekends or if there’s a more evenly distributed ridership throughout the day.To comprehensively understand these dynamics, additional research beyond peak hours during weekends is warranted.
Notably, during this period, heightened activity is observed in zones near Orchard Road and Somerset Road, indicating an increased number of trips, possibly from entertainment and shopping districts. This aligns with the expectation that people, having spent their weekends or holidays in these areas, are heading home during the evening peak. Consequently, bus operators can consider adjusting frequencies to accommodate the higher demand in these entertainment and shopping districts during weekend and holiday evenings.
5. Global and Local Measures of Spatial Association
5.1 Computing distance weights
For the subsequent sections, we will be trying to address Objective 2 as detailed in section 1.1.
The below codechunk is used to compute distance weights for weekday morning peak
wm_idw6_9 <- honeycomb_count6_9 %>%
mutate(nb = st_contiguity(geometry),
wts = st_inverse_distance(nb, geometry,
scale = 1,
alpha = 1),
.before = 1)
wm_idw6_9Simple feature collection with 2769 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3825.784 ymin: 26315.43 xmax: 48426.09 ymax: 53148.77
Projected CRS: SVY21 / Singapore TM
First 10 features:
nb wts grid_id Sum_TRIPS
1 0 58 103
2 4 0.003464102 221 52
3 0 225 78
4 2 0.003464102 329 185
5 7, 10 0.003464102, 0.003464102 332 1116
6 0 334 53
7 5, 10 0.003464102, 0.003464102 386 251
8 11 0.003464102 390 60
9 0 438 64
10 5, 7, 16 0.003464102, 0.003464102, 0.003464102 440 257
geometry
1 POLYGON ((3970.122 27815.43...
2 POLYGON ((4403.134 28565.43...
3 POLYGON ((4403.134 30565.43...
4 POLYGON ((4691.809 28565.43...
5 POLYGON ((4691.809 30065.43...
6 POLYGON ((4691.809 31065.43...
7 POLYGON ((4836.147 29815.43...
8 POLYGON ((4836.147 31815.43...
9 POLYGON ((4980.485 29065.43...
10 POLYGON ((4980.485 30065.43...
We duplicate the same for the other 3 peak intervals.
For weekday evening peak:
Show the code
wm_idw17_20 <- honeycomb_count17_20 %>%
mutate(nb = st_contiguity(geometry),
wts = st_inverse_distance(nb, geometry,
scale = 1,
alpha = 1),
.before = 1)
wm_idw17_20Simple feature collection with 2769 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3825.784 ymin: 26315.43 xmax: 48426.09 ymax: 53148.77
Projected CRS: SVY21 / Singapore TM
First 10 features:
nb wts grid_id Sum_TRIPS
1 0 58 390
2 4 0.003464102 221 114
3 0 225 291
4 2 0.003464102 329 1905
5 7, 10 0.003464102, 0.003464102 332 2899
6 0 334 241
7 5, 10 0.003464102, 0.003464102 386 297
8 11 0.003464102 390 368
9 0 438 296
10 5, 7, 16 0.003464102, 0.003464102, 0.003464102 440 560
geometry
1 POLYGON ((3970.122 27815.43...
2 POLYGON ((4403.134 28565.43...
3 POLYGON ((4403.134 30565.43...
4 POLYGON ((4691.809 28565.43...
5 POLYGON ((4691.809 30065.43...
6 POLYGON ((4691.809 31065.43...
7 POLYGON ((4836.147 29815.43...
8 POLYGON ((4836.147 31815.43...
9 POLYGON ((4980.485 29065.43...
10 POLYGON ((4980.485 30065.43...
For weekend/holidays morning peak:
Show the code
wm_idw11_14 <- honeycomb_count11_14 %>%
mutate(nb = st_contiguity(geometry),
wts = st_inverse_distance(nb, geometry,
scale = 1,
alpha = 1),
.before = 1)
wm_idw11_14Simple feature collection with 2781 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 4139.263 ymin: 26315.43 xmax: 48306.56 ymax: 53148.77
Projected CRS: SVY21 / Singapore TM
First 10 features:
nb wts grid_id Sum_TRIPS
1 2 0.003464102 5 26
2 1, 5 0.003464102, 0.003464102 113 26
3 0 117 52
4 0 118 76
5 2 0.003464102 167 187
6 12 0.003464102 224 494
7 13, 15 0.003464102, 0.003464102 280 691
8 0 282 45
9 14 0.003464102 283 5
10 0 330 21
geometry
1 POLYGON ((4283.601 28565.43...
2 POLYGON ((4572.276 28565.43...
3 POLYGON ((4572.276 30565.43...
4 POLYGON ((4572.276 31065.43...
5 POLYGON ((4716.614 28315.43...
6 POLYGON ((4860.951 30065.43...
7 POLYGON ((5005.289 30815.43...
8 POLYGON ((5005.289 31815.43...
9 POLYGON ((5005.289 32315.43...
10 POLYGON ((5149.626 29065.43...
For weekend/holidays evening peak:
Show the code
wm_idw16_19 <- honeycomb_count16_19 %>%
mutate(nb = st_contiguity(geometry),
wts = st_inverse_distance(nb, geometry,
scale = 1,
alpha = 1),
.before = 1)
wm_idw16_19Simple feature collection with 2753 features and 4 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3825.784 ymin: 26315.43 xmax: 48426.09 ymax: 53148.77
Projected CRS: SVY21 / Singapore TM
First 10 features:
nb wts grid_id Sum_TRIPS
1 0 58 56
2 4 0.003464102 221 14
3 0 225 100
4 2 0.003464102 329 346
5 7, 10 0.003464102, 0.003464102 332 634
6 0 334 55
7 5, 10 0.003464102, 0.003464102 386 132
8 11 0.003464102 390 49
9 0 438 53
10 5, 7, 16 0.003464102, 0.003464102, 0.003464102 440 111
geometry
1 POLYGON ((3970.122 27815.43...
2 POLYGON ((4403.134 28565.43...
3 POLYGON ((4403.134 30565.43...
4 POLYGON ((4691.809 28565.43...
5 POLYGON ((4691.809 30065.43...
6 POLYGON ((4691.809 31065.43...
7 POLYGON ((4836.147 29815.43...
8 POLYGON ((4836.147 31815.43...
9 POLYGON ((4980.485 29065.43...
10 POLYGON ((4980.485 30065.43...
Summary of the results notes that there are neighbors with no link for all peak intervals.
For weekday morning peak:
summary(wm_idw6_9$nb)Neighbour list object:
Number of regions: 2769
Number of nonzero links: 9520
Percentage nonzero weights: 0.1241627
Average number of links: 3.438064
64 regions with no links:
1 3 6 9 15 28 39 40 42 48 88 129 164 289 298 359 360 444 447 450 464
516 715 749 787 832 910 987 999 1058 1211 1254 1276 1294 1331 1353 1386
1429 1808 1809 1835 1839 1840 1841 1842 1865 1866 1889 1890 2368 2436
2499 2548 2591 2676 2690 2693 2707 2741 2746 2747 2749 2757 2769
Link number distribution:
0 1 2 3 4 5 6
64 191 504 642 651 511 206
191 least connected regions:
2 4 8 18 23 26 27 30 31 32 36 37 38 43 59 61 73 98 106 125 138 145 149 160 177 178 183 189 194 195 202 203 205 217 218 232 246 271 296 297 299 305 308 310 326 334 351 362 375 377 382 442 458 465 484 493 503 531 571 590 623 641 664 665 667 703 730 773 789 820 821 842 852 853 872 889 899 903 931 938 971 981 993 1010 1014 1025 1045 1093 1094 1102 1120 1128 1129 1130 1141 1163 1164 1170 1173 1179 1183 1188 1194 1201 1223 1225 1235 1237 1240 1255 1271 1277 1278 1288 1290 1303 1319 1328 1330 1334 1335 1347 1354 1355 1378 1387 1435 1460 1475 1487 1503 1525 1526 1750 1783 1820 1873 1891 1892 1913 1965 1969 1980 1992 1993 2009 2029 2031 2058 2143 2144 2161 2190 2212 2232 2242 2250 2251 2272 2301 2308 2309 2338 2367 2383 2390 2393 2405 2411 2415 2427 2456 2469 2482 2498 2528 2542 2554 2645 2666 2706 2708 2709 2714 2719 2720 2732 2738 2740 2759 2768 with 1 link
206 most connected regions:
89 201 207 213 235 254 266 267 279 303 315 323 337 367 373 378 379 384 389 395 396 405 413 416 419 421 427 429 468 492 494 495 496 501 511 514 518 528 547 567 585 618 706 767 777 781 791 831 839 881 882 960 989 990 1000 1001 1012 1013 1021 1041 1042 1043 1053 1067 1074 1113 1116 1132 1143 1184 1209 1243 1258 1267 1272 1280 1292 1358 1416 1437 1438 1452 1455 1456 1457 1468 1477 1479 1491 1502 1512 1513 1522 1567 1589 1594 1607 1611 1624 1631 1638 1646 1647 1649 1650 1664 1665 1666 1671 1688 1689 1690 1691 1699 1708 1710 1718 1722 1731 1739 1740 1744 1755 1770 1771 1777 1785 1786 1797 1799 1818 1830 1844 1847 1855 1940 1942 1957 1959 1970 1971 1985 1988 2002 2011 2013 2014 2018 2026 2027 2034 2041 2042 2055 2056 2061 2065 2066 2078 2079 2089 2090 2104 2119 2125 2131 2132 2135 2146 2150 2158 2171 2194 2235 2236 2238 2246 2253 2254 2267 2269 2282 2283 2296 2299 2303 2318 2329 2337 2354 2372 2375 2473 2485 2494 2503 2509 2513 2523 2628 2629 2649 2650 2681 2696 2727 with 6 links
For weekday evening peak:
summary(wm_idw17_20$nb)Neighbour list object:
Number of regions: 2769
Number of nonzero links: 9520
Percentage nonzero weights: 0.1241627
Average number of links: 3.438064
63 regions with no links:
1 3 6 9 15 28 39 40 42 48 88 129 164 289 297 358 359 443 446 449 463
515 714 748 786 831 909 987 999 1057 1210 1253 1275 1293 1330 1352 1385
1428 1807 1808 1834 1838 1839 1840 1841 1864 1865 1888 1889 2436 2499
2548 2591 2676 2690 2693 2707 2741 2746 2747 2749 2757 2769
Link number distribution:
0 1 2 3 4 5 6
63 194 503 639 654 509 207
194 least connected regions:
2 4 8 18 23 26 27 30 31 32 36 37 38 43 59 61 73 98 106 125 138 145 149 160 177 178 183 189 194 195 202 203 205 217 218 232 246 271 296 298 304 307 309 317 326 334 347 348 350 361 374 376 381 441 457 464 483 492 502 530 570 589 622 640 663 664 666 702 729 772 788 819 820 841 851 852 871 888 898 902 930 937 970 980 993 1010 1014 1025 1038 1044 1092 1093 1101 1119 1127 1128 1129 1140 1162 1163 1169 1172 1178 1182 1187 1193 1200 1222 1224 1234 1236 1239 1254 1270 1276 1277 1287 1289 1302 1318 1327 1329 1333 1334 1346 1353 1354 1377 1386 1434 1459 1474 1486 1502 1524 1525 1749 1782 1819 1872 1890 1891 1912 1964 1968 1979 1991 1992 2008 2028 2030 2057 2142 2143 2160 2189 2211 2231 2241 2249 2250 2271 2300 2307 2308 2337 2366 2367 2382 2393 2405 2411 2415 2427 2456 2469 2482 2498 2528 2542 2554 2645 2666 2706 2708 2709 2714 2719 2720 2732 2738 2740 2759 2768 with 1 link
207 most connected regions:
89 201 207 213 235 254 266 267 279 302 314 323 337 366 372 377 378 383 388 394 395 404 412 415 418 420 426 428 467 491 493 494 495 500 510 513 517 527 546 566 584 617 705 766 776 780 790 830 838 880 881 959 968 989 990 1000 1001 1012 1013 1021 1040 1041 1042 1052 1066 1073 1112 1115 1131 1142 1183 1208 1242 1257 1266 1271 1279 1291 1357 1415 1436 1437 1451 1454 1455 1456 1467 1476 1478 1490 1501 1511 1512 1521 1566 1588 1593 1606 1610 1623 1630 1637 1645 1646 1648 1649 1663 1664 1665 1670 1687 1688 1689 1690 1698 1707 1709 1717 1721 1730 1738 1739 1743 1754 1769 1770 1776 1784 1785 1796 1798 1817 1829 1843 1846 1854 1939 1941 1956 1958 1969 1970 1984 1987 2001 2010 2012 2013 2017 2025 2026 2033 2040 2041 2054 2055 2060 2064 2065 2077 2078 2088 2089 2103 2118 2124 2130 2131 2134 2145 2149 2157 2170 2193 2234 2235 2237 2245 2252 2253 2266 2268 2281 2282 2295 2298 2302 2317 2328 2336 2353 2371 2374 2473 2485 2494 2503 2509 2513 2523 2628 2629 2649 2650 2681 2696 2727 with 6 links
For weekend/holidays morning peak:
summary(wm_idw11_14$nb)Neighbour list object:
Number of regions: 2781
Number of nonzero links: 9482
Percentage nonzero weights: 0.1226021
Average number of links: 3.409565
66 regions with no links:
3 4 8 10 11 22 24 28 35 38 46 198 227 282 295 296 307 361 436 443 447
487 536 614 657 658 695 800 867 947 992 1005 1163 1171 1183 1207 1218
1245 1246 1282 1283 1289 1321 1344 1360 1412 1495 1717 1781 1814 1815
1843 1844 1845 1869 2152 2197 2249 2395 2428 2596 2754 2761 2764 2780
2781
Link number distribution:
0 1 2 3 4 5 6
66 203 498 663 684 444 223
203 least connected regions:
1 5 6 9 16 21 27 34 56 58 80 84 111 130 133 144 147 149 152 154 176 184 192 195 199 214 221 234 241 242 251 281 294 297 298 313 327 328 334 343 344 346 347 414 415 444 445 451 455 456 462 467 470 477 484 489 522 565 587 630 632 655 670 680 685 686 687 712 731 747 754 761 771 798 804 815 830 839 842 879 881 882 907 915 918 973 996 1018 1021 1050 1056 1064 1073 1090 1091 1097 1111 1149 1174 1190 1194 1200 1206 1229 1247 1259 1284 1294 1307 1309 1310 1318 1322 1331 1338 1339 1343 1345 1369 1376 1378 1390 1398 1402 1408 1409 1411 1439 1460 1464 1478 1489 1501 1559 1613 1714 1723 1750 1782 1808 1817 1826 1846 1864 1881 1882 1883 1894 1937 1957 1975 1976 1977 1992 2044 2049 2080 2143 2233 2234 2247 2280 2293 2321 2336 2354 2355 2393 2411 2441 2442 2450 2517 2549 2554 2576 2578 2582 2583 2584 2586 2602 2610 2612 2634 2695 2699 2725 2731 2740 2743 2749 2750 2753 2758 2762 2767 2768 2769 2772 2773 2777 2778 with 1 link
223 most connected regions:
93 113 218 225 231 239 245 255 256 269 288 303 311 351 353 386 393 402 403 405 409 411 418 419 428 466 486 491 499 503 505 513 518 533 534 545 546 563 579 580 591 594 645 661 673 690 736 737 773 785 786 802 818 820 836 859 869 871 887 895 896 931 964 993 1007 1008 1016 1022 1026 1027 1037 1047 1048 1057 1059 1068 1076 1100 1113 1158 1168 1176 1187 1197 1203 1214 1227 1233 1240 1250 1255 1261 1272 1273 1279 1287 1298 1299 1312 1329 1348 1353 1373 1383 1386 1434 1435 1452 1475 1480 1493 1494 1496 1502 1503 1518 1524 1549 1563 1565 1569 1585 1604 1609 1616 1689 1702 1703 1704 1719 1720 1721 1728 1738 1739 1740 1745 1749 1754 1755 1767 1769 1770 1771 1777 1785 1790 1802 1820 1833 1930 1946 1947 1959 1964 1965 1979 1981 1982 1985 1995 1996 1997 2003 2007 2008 2011 2026 2027 2040 2050 2051 2063 2064 2074 2075 2089 2149 2177 2179 2191 2192 2193 2204 2205 2206 2221 2222 2236 2237 2238 2253 2254 2270 2279 2288 2295 2312 2319 2330 2347 2362 2369 2436 2439 2444 2455 2473 2476 2489 2490 2500 2512 2519 2520 2521 2527 2530 2539 2542 2551 2626 2687 with 6 links
For weekend/holidays evening peak:
summary(wm_idw16_19$nb)Neighbour list object:
Number of regions: 2753
Number of nonzero links: 9408
Percentage nonzero weights: 0.1241323
Average number of links: 3.417363
66 regions with no links:
1 3 6 9 15 28 39 40 42 48 87 128 163 286 294 355 356 437 441 442 445
449 463 515 747 784 829 907 984 994 1051 1198 1241 1263 1281 1318 1340
1373 1416 1790 1791 1817 1821 1822 1823 1824 1847 1848 1871 1872 2138
2193 2429 2495 2544 2587 2672 2686 2689 2703 2729 2734 2735 2737 2745
2753
Link number distribution:
0 1 2 3 4 5 6
66 197 522 620 635 511 202
197 least connected regions:
2 4 8 18 23 26 27 30 31 32 36 37 38 43 59 61 73 97 105 124 137 144 148 159 176 177 181 187 192 193 200 201 203 215 216 230 244 268 292 293 295 301 304 305 306 322 330 347 357 366 370 372 377 438 440 448 457 464 483 492 502 530 570 589 622 640 663 664 666 702 728 770 786 817 818 839 849 850 869 886 896 900 928 935 967 977 989 1004 1008 1019 1032 1038 1082 1083 1091 1097 1115 1116 1117 1128 1150 1151 1157 1160 1166 1170 1175 1181 1188 1210 1212 1222 1224 1227 1242 1258 1264 1265 1275 1277 1290 1306 1315 1317 1321 1322 1334 1341 1342 1365 1374 1446 1459 1471 1481 1486 1507 1508 1732 1765 1802 1855 1873 1874 1895 1947 1951 1962 1974 1975 1991 2011 2013 2040 2125 2126 2144 2173 2196 2216 2226 2234 2235 2243 2257 2287 2294 2295 2324 2354 2355 2371 2383 2389 2396 2406 2419 2451 2464 2478 2494 2524 2538 2550 2641 2662 2702 2704 2705 2708 2710 2711 2720 2726 2728 2747 2750 with 1 link
202 most connected regions:
88 199 205 211 233 252 263 264 276 299 311 319 333 362 368 373 374 379 384 390 391 400 408 411 414 416 422 424 467 491 493 494 495 500 510 513 517 527 546 566 584 617 705 765 774 788 828 836 878 879 956 986 995 1007 1034 1035 1036 1046 1060 1066 1101 1104 1119 1130 1171 1196 1230 1245 1254 1259 1267 1279 1345 1403 1423 1424 1438 1441 1442 1443 1454 1461 1463 1475 1485 1495 1496 1504 1549 1571 1576 1589 1593 1606 1613 1620 1628 1629 1631 1632 1646 1647 1648 1653 1670 1671 1672 1673 1681 1690 1692 1700 1704 1713 1721 1722 1726 1737 1752 1753 1759 1767 1768 1779 1781 1800 1812 1826 1829 1837 1922 1924 1939 1941 1952 1953 1967 1970 1984 1993 1995 1996 2000 2008 2009 2016 2023 2024 2037 2038 2043 2047 2048 2060 2061 2071 2072 2086 2101 2107 2113 2114 2117 2128 2132 2141 2154 2177 2219 2220 2222 2230 2237 2238 2252 2254 2267 2268 2281 2285 2289 2304 2315 2323 2340 2346 2360 2363 2430 2469 2481 2490 2499 2505 2509 2519 2624 2625 2645 2646 2677 2692 with 6 links
5.2 Objective 2: Computing and Visualising LISA/HSCA
5.2.1 Computing LISA
An error was encountered when trying to compute LISA given the available data.
LISA6_9 <- wm_idw6_9 %>% mutate(local_moran = local_moran( Sum_TRIPS, nb, wts, nsim = 99)cut) .before = 1) %>% unnest(local_moran)
LISA6_9 ####
Error in stopifnot(): ℹ In argument: local_moran = local_moran(Sum_TRIPS, nb, wts, nsim = 99). Caused by error in cut.default(): ! number of intervals and length of ‘labels’ differ Backtrace: 1. … %>% unnest(local_moran) 14. sfdep::local_moran(Sum_TRIPS, nb, wts, nsim = 99) 15. spdep::localmoran_perm(…) 18. base::cut.default(lx, c(-Inf, lxx, Inf), labels = lbs) 19. base::stop(“number of intervals and length of ‘labels’ differ”) Error in stopifnot(!inherits(x, “sf”), !missing(sf_column_name), !missing(agr)) :
Caused by error in cut.default(): ! number of intervals and length of ‘labels’ differ
As I was not able to proceed on using local_moran() to compute LISA, I have switched over to using HSCA as a means to analyse local patterns of spatial clustering or dispersion in a dataset. Instead of analyszing using categories typically used in LISA maps, which are:
High-High (HH): High values surrounded by high values.
Low-Low (LL): Low values surrounded by low values.
High-Low (HL): High values surrounded by low values.
Low-High (LH): Low values surrounded by high values.
We would now use values of Gi* to determine the intensity of the clustering.
5.3.2 Computing HSCA
The local GI* statistics is computed using the code chunk below for weekday morning peak.
HSCA6_9 <- wm_idw6_9 %>%
mutate(local_Gi = local_gstar_perm(
Sum_TRIPS, nb, wts, nsim = 99),
.before = 1) %>%
unnest(local_Gi)
HSCA6_9Simple feature collection with 2769 features and 12 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3825.784 ymin: 26315.43 xmax: 48426.09 ymax: 53148.77
Projected CRS: SVY21 / Singapore TM
# A tibble: 2,769 × 13
gi_star e_gi var_gi p_value p_sim p_folded_sim skewness kurtosis nb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <nb>
1 -0.533 4.00e-6 0 NaN NaN 1 0.01 NaN <int>
2 -0.753 1.67e-4 1.07e-7 -0.496 0.620 0.3 0.15 5.17 <int>
3 -0.534 3.03e-6 0 NaN NaN 1 0.01 NaN <int>
4 -0.753 1.57e-4 3.13e-8 -0.863 0.388 0.06 0.03 1.40 <int>
5 -0.879 2.92e-4 1.01e-7 -0.854 0.393 0.04 0.02 1.95 <int>
6 -0.536 2.06e-6 0 NaN NaN 1 0.01 NaN <int>
7 -0.879 2.56e-4 2.02e-7 -0.524 0.601 0.2 0.1 6.98 <int>
8 -0.759 1.53e-4 4.30e-8 -0.729 0.466 0.04 0.02 2.29 <int>
9 -0.535 2.48e-6 0 NaN NaN 1 0.01 NaN <int>
10 -1.03 2.55e-4 5.94e-8 -0.980 0.327 0.12 0.06 1.76 <int>
# ℹ 2,759 more rows
# ℹ 4 more variables: wts <list>, grid_id <int>, Sum_TRIPS <dbl>,
# geometry <POLYGON [m]>
We duplicate the same for the other 3 peak intervals.
For weekday evening peak:
Show the code
HSCA17_20 <- wm_idw17_20 %>%
mutate(local_Gi = local_gstar_perm(
Sum_TRIPS, nb, wts, nsim = 99),
.before = 1) %>%
unnest(local_Gi)
HSCA17_20Simple feature collection with 2769 features and 12 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3825.784 ymin: 26315.43 xmax: 48426.09 ymax: 53148.77
Projected CRS: SVY21 / Singapore TM
# A tibble: 2,769 × 13
gi_star e_gi var_gi p_value p_sim p_folded_sim skewness kurtosis nb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <nb>
1 -0.361 1.58e-5 0 NaN NaN 1 0.01 NaN <int>
2 -0.474 1.45e-4 1.25e-7 -0.295 0.768 0.88 0.44 7.37 <int>
3 -0.365 1.18e-5 0 NaN NaN 1 0.01 NaN <int>
4 -0.474 2.49e-4 3.30e-7 -0.363 0.717 0.06 0.03 5.96 <int>
5 -0.562 3.12e-4 2.06e-7 -0.577 0.564 0.1 0.05 4.40 <int>
6 -0.367 9.77e-6 0 NaN NaN 1 0.01 NaN <int>
7 -0.562 2.80e-4 6.33e-7 -0.288 0.773 0.4 0.2 7.13 <int>
8 -0.522 1.97e-4 1.47e-7 -0.491 0.624 0.16 0.08 4.40 <int>
9 -0.365 1.20e-5 0 NaN NaN 1 0.01 NaN <int>
10 -0.673 3.34e-4 4.21e-7 -0.453 0.651 0.06 0.03 6.39 <int>
# ℹ 2,759 more rows
# ℹ 4 more variables: wts <list>, grid_id <int>, Sum_TRIPS <dbl>,
# geometry <POLYGON [m]>
For weekend/holidays morning peak:
Show the code
HSCA11_14 <- wm_idw11_14 %>%
mutate(local_Gi = local_gstar_perm(
Sum_TRIPS, nb, wts, nsim = 99),
.before = 1) %>%
unnest(local_Gi)
HSCA11_14Simple feature collection with 2781 features and 12 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 4139.263 ymin: 26315.43 xmax: 48306.56 ymax: 53148.77
Projected CRS: SVY21 / Singapore TM
# A tibble: 2,781 × 13
gi_star e_gi var_gi p_value p_sim p_folded_sim skewness kurtosis nb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <nb>
1 -0.671 2.22e-4 3.24e-7 -0.384 0.701 0.16 0.08 5.56 <int>
2 -0.805 2.82e-4 2.05e-7 -0.599 0.549 0.06 0.03 3.73 <int>
3 -0.469 6.82e-6 0 NaN NaN 1 0.01 NaN <int>
4 -0.465 9.96e-6 0 NaN NaN 1 0.01 NaN <int>
5 -0.651 1.56e-4 4.37e-8 -0.680 0.497 0.18 0.09 3.27 <int>
6 -0.610 2.40e-4 2.72e-7 -0.391 0.696 0.18 0.09 6.28 <int>
7 -0.637 2.54e-4 9.53e-8 -0.554 0.580 0.42 0.21 3.86 <int>
8 -0.471 5.90e-6 0 NaN NaN 1 0.01 NaN <int>
9 -0.673 1.74e-4 7.60e-8 -0.622 0.534 0.1 0.05 3.06 <int>
10 -0.475 2.75e-6 0 NaN NaN 1 0.01 NaN <int>
# ℹ 2,771 more rows
# ℹ 4 more variables: wts <list>, grid_id <int>, Sum_TRIPS <dbl>,
# geometry <POLYGON [m]>
For weekend/holidays evening peak:
Show the code
HSCA16_19 <- wm_idw16_19 %>%
mutate(local_Gi = local_gstar_perm(
Sum_TRIPS, nb, wts, nsim = 99),
.before = 1) %>%
unnest(local_Gi)
HSCA16_19Simple feature collection with 2753 features and 12 fields
Geometry type: POLYGON
Dimension: XY
Bounding box: xmin: 3825.784 ymin: 26315.43 xmax: 48426.09 ymax: 53148.77
Projected CRS: SVY21 / Singapore TM
# A tibble: 2,753 × 13
gi_star e_gi var_gi p_value p_sim p_folded_sim skewness kurtosis nb
<dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <nb>
1 -0.399 7.32e-6 0 NaN NaN 1 0.01 NaN <int>
2 -0.539 1.62e-4 5.72e-8 -0.579 0.563 0.5 0.25 2.93 <int>
3 -0.393 1.31e-5 0 NaN NaN 1 0.01 NaN <int>
4 -0.539 2.39e-4 1.71e-7 -0.520 0.603 0.02 0.01 3.73 <int>
5 -0.632 2.77e-4 2.03e-7 -0.531 0.595 0.06 0.03 4.64 <int>
6 -0.399 7.19e-6 0 NaN NaN 1 0.01 NaN <int>
7 -0.632 2.21e-4 6.12e-8 -0.739 0.460 0.2 0.1 3.67 <int>
8 -0.570 2.15e-4 1.72e-7 -0.509 0.611 0.04 0.02 3.47 <int>
9 -0.400 6.93e-6 0 NaN NaN 1 0.01 NaN <int>
10 -0.749 2.68e-4 9.65e-8 -0.767 0.443 0.04 0.02 3.48 <int>
# ℹ 2,743 more rows
# ℹ 4 more variables: wts <list>, grid_id <int>, Sum_TRIPS <dbl>,
# geometry <POLYGON [m]>
5.2.3 Visualising HSCA
Using the HSCA statistics, we will display the HSCA maps of the passengers trips generate by origin at hexagon level for each peak interval. These maps only display the staistically significant zones (i.e. p-value < 0.05)
For weekday morning peak:
HSCA_sig6_9 <- HSCA6_9 %>%
filter(p_sim < 0.05)
tmap_mode("view")
tm_shape(HSCA6_9) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(HSCA_sig6_9) +
tm_fill("gi_star") +
tm_borders(alpha = 0.4)During the weekday morning peak (6am to 9am), significant hot spots and clusters emerge near key transit hubs, specifically around the Woodlands checkpoint, Woodlands MRT, and Boon Lay Interchange. These locations exhibit concentrated high-value clusters, albeit relatively small with 3 to 4 interconnected hexagons. Additionally, a statistically significant warm spot is identified near Clementi MRT, featuring a medium-sized cluster spanning 8 connected hexagons.
In contrast, larger clusters, while less warm, characterize zones near Tampines and Ang Mo Kio bus interchanges, with more than 9 interconnected hexagons. This suggests a notable preference for bus usage among residents in these residential areas during the morning peak, where bus trips were taken likely closer to their homes rather than the interchange itself. To validate these findings, a thorough comparison with data from other modes of public transport in these residential zones is essential. This holistic analysis will help confirm commuting preferences and guide potential adjustments to bus frequencies to better align with the observed demand during the weekday morning peak.
For weekday evening peak:
HSCA_sig17_20 <- HSCA17_20 %>%
filter(p_sim < 0.05)
tmap_mode("view")
tm_shape(HSCA17_20) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(HSCA_sig17_20) +
tm_fill("gi_star") +
tm_borders(alpha = 0.4)During the weekday evening peak (5pm to 8pm), significant hot spots and clusters persist near transit hubs, particularly around Woodlands MRT and Boon Lay Interchange. Notably, a hotter spot near Tampines Bus Interchange emerges, but with a reduced number of connected hexagons, indicating a likely preference for transiting to homes via the bus interchange. Further investigation into Tampines bus interchange operations is recommended to assess the adequacy of current bus frequencies and routes.
Contrastingly, the cluster at Woodlands checkpoint diminishes during this period, accompanied by a drop in Gi* value. This suggests a shift in passenger behavior, possibly indicating higher bus usage for transit in and out of the checkpoint during the weekday morning peak and reduced reliance during the evening peak. A comprehensive study on the commuting habits and bus ridership of Singaporeans and Malaysians working across borders can offer valuable insights into this observed drop, informing potential adjustments to transportation strategies.
For weekend/holidays morning peak:
HSCA_sig11_14 <- HSCA11_14 %>%
filter(p_sim < 0.05)
tmap_mode("view")
tm_shape(HSCA11_14) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(HSCA_sig11_14) +
tm_fill("gi_star") +
tm_borders(alpha = 0.4)During the weekend/holidays morning peak (11 am to 2 pm), notable hot spots and clusters persist at key transit hubs, including Woodlands checkpoint, Woodlands MRT, Boon Lay Interchange, and newly seen clusters at Ang Mo Kio, Tampines, and Bedok bus interchanges. Woodlands checkpoint exhibits a hotter cluster with increased connected hexagons, suggesting heightened ridership to and from the checkpoint.
Medium-sized clusters are observed near Bugis MRT/Bugis Junction and Bukit Panjang MRT/Junction10, strategically located near shopping malls and places of worship, such as Sri Mrugugan Hill Temple in Bukit Panjang, Sri Krishnan Temple, and a Buddhist temple. This aligns with the likelihood of individuals engaging in religious activities during weekends. With the weekend peak starting at 11 am, post-worship, passengers are likely utilizing transportation from these areas to return home or visit other places. Further demographic studies could provide valuable insights into the commuting habits of these passengers during weekend mornings.
For weekend/holidays evening peak:
HSCA_sig16_19 <- HSCA16_19 %>%
filter(p_sim < 0.05)
tmap_mode("view")
tm_shape(HSCA16_19) +
tm_polygons() +
tm_borders(alpha = 0.5) +
tm_shape(HSCA_sig16_19) +
tm_fill("gi_star") +
tm_borders(alpha = 0.4)During the weekend/holidays evening peak (4 pm to 7 pm), prominent hot spots and clusters persist at key transit hubs, notably Woodlands checkpoint, Woodlands MRT, and Boon Lay Interchange, maintaining consistency with the weekend morning peak. The previously identified hot clusters at Ang Mo Kio, Tampines, and Bedok bus interchanges also endure during the evening peak.
Interestingly, bus trip clusters near Bukit Panjang no longer sustain in the evening, indicating a shift in commuting patterns. Conversely, warm clusters persist near Bugis MRT, a location associated with shopping and entertainment. Additionally, a warm spot is identified near Orchard/Somerset Rd, aligning with the entertainment and shopping district, showcasing continued demand in these areas during the weekend/holiday evening peak hours. Understanding these spatial patterns aids in optimizing transportation services to accommodate the persistent demand at specific transit hubs and entertainment districts..
6. Conclusion
Spatial data analysis alongside clear visualization helps policy makers quickly identify patterns and areas of further investigation. On its own spatial data analysis needs to be coupled with domain expertise and study of the relevant field to provide meaningful discussions. While we identify relationships and clusters through this analysis, we could only make assumptions based on available data. Spatial analysis does however provides a quick understanding of the distribution, relationship and clustering patterns, which will help in further decision making.
7. Future Works
As discussed in various sections, there are information gaps that need to be filled up to provide a meaningful analysis, and recommendations against the findings. Collaboration with the transport authorities to better understand commuter trends and habits with regards to other public transport can be explored to bring about a more holistic view over the spatial information on public transport. Furthermore, this study only takes into the bus rides during peak periods for Oct 2023, and there may be room to expand further to cover more months or a wide coverage time period per month. Comparison of spatial observations with demographic factors can also help to policy makers determine how to tailor its transport system to the nation.